

Search for: All records

Creators/Authors contains: "Wiest, Olaf"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Pretraining NERF models on chemically related mechanisms significantly improves performance compared to pretraining on larger but mechanistically dissimilar reaction datasets. 
    Free, publicly-accessible full text available May 14, 2026
  2. Free, publicly-accessible full text available September 22, 2026
  3. Chemical reaction data has existed, and still largely exists, in unstructured form. Curating such information into datasets suitable for tasks such as yield and reaction-outcome prediction is impractical by manual curation and not possible to automate through programmatic means alone. Large language models (LLMs) have emerged as potent tools with remarkable capabilities for processing textual information and could therefore be extremely useful in automating this process. To address the challenge of unstructured data, we manually curated a dataset of structured chemical reaction data to fine-tune and evaluate LLMs. We propose a paradigm that leverages prompt tuning, fine-tuning, and a verifier that checks the extracted information. We evaluate the capabilities of various LLMs, including LLAMA-2 and GPT models with different parameter counts, on the data-extraction task. Our results show that prompt tuning of GPT-4 yields the best accuracy and evaluation results. Fine-tuning LLAMA-2 models with hundreds of samples does, however, enable them to extract and organize scientific material according to user-defined schemas. This workflow demonstrates an adaptable approach to chemical reaction data extraction while also highlighting the challenges posed by nuance in chemical information. Our code is open-sourced on GitHub. 
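The extract-then-verify paradigm described above can be sketched as a loop that prompts an LLM for schema-conformant JSON and re-prompts until a verifier accepts the result. The prompt wording, schema keys, and the stubbed `call_llm` function below are illustrative assumptions, not the paper's actual code or API.

```python
# Sketch of an LLM extraction workflow with a verifier step.
# call_llm is a placeholder for a real model API (e.g., GPT-4 via an SDK);
# it returns a canned answer here so the sketch is runnable.
import json

SCHEMA_KEYS = {"reactants", "products", "yield"}  # hypothetical user-defined schema

def call_llm(prompt):
    # Stub standing in for a real LLM call.
    return json.dumps({
        "reactants": ["phenylboronic acid", "cyclohexenone"],
        "products": ["3-phenylcyclohexanone"],
        "yield": "92%",
    })

def verify(record):
    # Verifier: reject outputs that are not dicts covering the schema keys.
    return isinstance(record, dict) and SCHEMA_KEYS.issubset(record)

def extract_reaction(text, max_retries=2):
    prompt = f"Extract reactants, products, and yield as JSON:\n{text}"
    for _ in range(max_retries + 1):
        try:
            record = json.loads(call_llm(prompt))
        except json.JSONDecodeError:
            continue  # malformed output: re-prompt
        if verify(record):
            return record
    return None  # give up after exhausting retries

record = extract_reaction("Phenylboronic acid was added to cyclohexenone ...")
```

In practice the verifier is what makes the pipeline robust: malformed or schema-violating completions are caught and retried rather than silently entering the dataset.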
  4. The application of computational methods in enantioselective catalysis has evolved from the rationalization of the observed stereochemical outcome to its prediction and to the design of chiral ligands. This Perspective provides an overview of the current methods used, ranging from atomistic modeling of the transition structures involved to correlation-based methods, with particular emphasis placed on the Q2MM/CatVS method. Using three enantioselective palladium-catalyzed reactions as case studies, namely, the conjugate addition of arylboronic acids to enones, the enantioselective redox relay Heck reaction, and the Tsuji–Trost allylic amination, we argue that computational methods have become truly equal partners to experimental studies in that, in some cases, they are able to correct published stereochemical assignments. Finally, the consequences of this approach for data-driven methods are discussed. 
  5. A proline-squaraine ligand (Pro-SqEB) that demonstrates high levels of stereoselectivity in olefin cyclopropanations when anchored to a Rh2(II) scaffold is introduced. High yields and enantioselectivities were achieved in the cyclopropanation of alkenes with diazo compounds in the presence of Rh2(Pro-SqEB)4. Notably, the unique electronic and steric design of this catalyst enabled the use of polar solvents that are otherwise incompatible with most Rh(II) complexes. 
  6. Molecular representation learning (MRL) is a key step in building the connection between machine learning and chemical science. In particular, it encodes molecules as numerical vectors that preserve molecular structures and features, on top of which downstream tasks (e.g., property prediction) can be performed. Recently, MRL has achieved considerable progress, especially in methods based on deep molecular graph learning. In this survey, we systematically review these graph-based molecular representation techniques, especially the methods incorporating chemical domain knowledge. Specifically, we first introduce the features of 2D and 3D molecular graphs. Then we summarize and categorize MRL methods into three groups based on their input. Furthermore, we discuss some typical chemical applications supported by MRL. To facilitate studies in this fast-developing area, we also list the benchmarks and commonly used datasets in the paper. Finally, we share our thoughts on future research directions. 
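The 2D molecular graph encoding that MRL methods build on can be illustrated with a minimal sketch: atoms become node feature vectors and bonds an adjacency matrix. The element-only one-hot features below are a simplifying assumption; real pipelines (e.g., those built on RDKit) use much richer atom and bond features.

```python
# Minimal sketch of a 2D molecular graph representation.
ELEMENTS = ["C", "N", "O"]  # toy vocabulary; real feature sets are far larger

def one_hot(symbol):
    # Encode an element symbol as a one-hot vector over the toy vocabulary.
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]

def molecular_graph(atoms, bonds):
    """atoms: list of element symbols; bonds: list of (i, j) atom-index pairs."""
    node_features = [one_hot(a) for a in atoms]
    adjacency = [[0] * len(atoms) for _ in atoms]
    for i, j in bonds:
        adjacency[i][j] = adjacency[j][i] = 1  # molecular graphs are undirected
    return node_features, adjacency

# Toy three-atom fragment, C-C-O (bond orders omitted for brevity)
feats, adj = molecular_graph(["C", "C", "O"], [(0, 1), (1, 2)])
```

A graph neural network would then aggregate these node features along the adjacency structure to produce the molecule-level vector used by downstream property-prediction tasks.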